Preparing for Production
Working example: a credit scoring system. Reasons to want explanations:
Regulatory or legal requirements;
Understand your model and convince stakeholders to use it;
Justify decisions to individual customers.
Example: Decision Tree
Repeatedly partition covariate space
Mimics human decision making
Medical triage trees optimise for speed of decision
Fitted trees usually optimise for classification performance
Trade off flexibility & robustness for an explanation.
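A minimal sketch of "repeatedly partition covariate space", using only the standard library. The applicant data (income in thousands, number of missed payments), the Gini criterion and the function names `gini`, `best_split`, `grow` and `predict` are illustrative assumptions, not taken from the course material.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Best axis-aligned split as (impurity, feature, threshold), or None."""
    best = None
    for j in range(len(rows[0])):
        # Every observed value except the largest is a candidate threshold.
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def grow(rows, labels, depth=0, max_depth=2):
    """Recursively partition until a node is pure or max_depth is reached."""
    split = best_split(rows, labels) if depth < max_depth and gini(labels) > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    _, j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {"feature": j, "threshold": t,
            "left": grow(*zip(*left), depth + 1, max_depth),
            "right": grow(*zip(*right), depth + 1, max_depth)}

def predict(tree, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree

# Hypothetical applicants: (income in thousands, missed payments).
rows = [(20, 0), (25, 2), (40, 0), (45, 3), (60, 0), (80, 1)]
labels = ["reject", "reject", "accept", "reject", "accept", "accept"]
tree = grow(rows, labels)
print(tree)
```

Each internal node reads as a single yes/no question about one covariate, which is what makes the fitted tree easy to explain to an individual customer.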
Height no longer significant, sign of point estimate changed.
Effect and interpretation depend on which covariates are included.
SHAP averages over all combinations.
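To make "averages over all combinations" concrete, here is a sketch of exact Shapley values for a two-covariate score. It assumes that a covariate is "left out" by replacing it with a baseline value; this is a simplification, and SHAP implementations use other approximations. The `score` function and all numbers are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline point."""
    p = len(x)

    def value(subset):
        # Covariates in `subset` take their observed value, the rest the baseline.
        z = [x[j] if j in subset else baseline[j] for j in range(p)]
        return predict(z)

    phi = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        total = 0.0
        for size in range(p):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                gain = value(set(subset) | {j}) - value(set(subset))
                total += weight * gain
        phi.append(total)
    return phi

def score(z):
    """Hypothetical credit score with an income-debt interaction."""
    income, debt = z
    return 2.0 * income - 1.0 * debt + 0.25 * income * debt

x = [3.0, 2.0]
baseline = [1.0, 1.0]
phi = shapley_values(score, x, baseline)
print(phi, sum(phi), score(x) - score(baseline))
```

The contributions sum to the difference between the prediction at `x` and at the baseline, so the interaction term is shared between the two covariates rather than attributed to whichever happens to be added last.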
Simpson's paradox: a trend disappears or reverses when groups are split / combined.
Lots of other names, including Ecological Fallacy.
Not actually a paradox at all.
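The reversal described above (commonly called Simpson's paradox) can be reproduced with a few made-up points. The two groups and their values are invented for illustration: within each group the trend is negative, but pooling the groups gives a positive trend.

```python
def slope(xs, ys):
    """Least-squares slope of y on x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: two groups with the same within-group trend.
group_a = {"x": [1, 2, 3], "y": [8, 7, 6]}
group_b = {"x": [6, 7, 8], "y": [13, 12, 11]}

slope_a = slope(group_a["x"], group_a["y"])
slope_b = slope(group_b["x"], group_b["y"])
slope_pooled = slope(group_a["x"] + group_b["x"], group_a["y"] + group_b["y"])

print(slope_a, slope_b, slope_pooled)
```

Nothing contradictory is happening: the pooled slope answers a marginal question, while the within-group slopes answer a question conditional on group membership.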
Shuffle covariate values to remove any relationship & inspect how predictions change.
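A dependency-free sketch of the shuffling idea, measuring the average increase in mean squared error after a column is permuted. The "fitted" model, the simulated data and the function names are assumptions made for illustration.

```python
import random

def mse(model, rows, ys):
    """Mean squared error of model predictions."""
    return sum((model(r) - y) ** 2 for r, y in zip(rows, ys)) / len(ys)

def permutation_importance(model, rows, ys, j, n_repeats=20, seed=0):
    """Average increase in MSE when covariate j is shuffled across rows."""
    rng = random.Random(seed)
    base = mse(model, rows, ys)
    increases = []
    for _ in range(n_repeats):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # breaks any link between covariate j and the response
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        increases.append(mse(model, shuffled, ys) - base)
    return sum(increases) / n_repeats

# Hypothetical setting: the response depends on the first covariate only.
data_rng = random.Random(1)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
ys = [3.0 * r[0] for r in rows]

def model(r):
    """Stand-in for a fitted model; it ignores the second covariate."""
    return 3.0 * r[0]

importance_0 = permutation_importance(model, rows, ys, j=0)
importance_1 = permutation_importance(model, rows, ys, j=1)
print(importance_0, importance_1)
```

Shuffling the first covariate degrades the predictions, while shuffling the second leaves them unchanged, so its importance is zero.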
Construct an explainable model to describe the local behaviour of a model that is not explainable.
Methods like LIME use a linear meta-model motivated by Taylor’s Theorem.
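A one-dimensional sketch in the spirit of LIME, not the algorithm of the LIME library itself: perturb the point of interest, weight the perturbed points by proximity, and fit a weighted straight line. The black-box function, kernel width and sample size are arbitrary assumptions.

```python
import math
import random

def local_linear_surrogate(black_box, x0, width=0.1, n_samples=500, seed=0):
    """Weighted least-squares line approximating black_box near x0."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0.0, width) for _ in range(n_samples)]
    ys = [black_box(x) for x in xs]
    # Gaussian proximity weights: nearby perturbations count for more.
    ws = [math.exp(-((x - x0) ** 2) / (2 * width ** 2)) for x in xs]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

def black_box(x):
    """Hypothetical model that is not explainable as a whole."""
    return math.sin(3 * x) + x ** 2

x0 = 0.5
intercept, slope = local_linear_surrogate(black_box, x0)
true_derivative = 3 * math.cos(3 * x0) + 2 * x0
print(slope, true_derivative)
```

The fitted slope approximates the derivative of the black box at `x0`, which is the first-order Taylor term; it is only a local description and need not hold far from `x0`.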
Can move from conditional to marginal explanations and from local to global explanations.
Requires integration with respect to the covariate density \(f(x)\), which is unknown.
Use the empirical distribution instead and the integral simplifies to a sum!
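One way to read this step, in assumed notation: with fitted model \(\hat y\), reading \(f\) as the unknown density of the remaining covariates \(x_{-j}\), the marginal effect \(\int \hat y(x_j, x_{-j}) f(x_{-j}) \, dx_{-j}\) is approximated by \(\frac{1}{n}\sum_{i=1}^{n} \hat y(x_j, x_{i,-j})\). The sketch below computes that sum for a hypothetical two-covariate model; the function name and data are illustrative.

```python
def partial_dependence(model, rows, j, value):
    """Average prediction with covariate j fixed at `value`.

    The other covariates keep their observed values, so the average is a
    sum over the empirical distribution rather than an integral.
    """
    fixed = [r[:j] + (value,) + r[j + 1:] for r in rows]
    return sum(model(r) for r in fixed) / len(rows)

def model(r):
    """Hypothetical fitted model with an interaction between covariates."""
    return r[0] * r[1]

# Hypothetical observed covariates.
rows = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]

# Global, marginal effect of the first covariate over a grid of values.
grid = [1.0, 2.0, 3.0]
curve = [partial_dependence(model, rows, j=0, value=v) for v in grid]
print(curve)
```

Each term in the sum is a conditional prediction for one observed row; averaging over rows turns these into a single marginal curve.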
Trade-off between complexity and explainability.
Conditional effects can be tricky to explain.
Approximate more complex models locally to get localised explanations.
Aggregate local, conditional effects to get global, marginal effects.
Effective Data Science: Production - Explainability - Zak Varty